inference procedure
A Structured Prediction Approach for Generalization in Cooperative Multi-Agent Reinforcement Learning
Effective coordination is crucial to solve multi-agent collaborative (MAC) problems. While centralized reinforcement learning methods can optimally solve small MAC instances, they do not scale to large problems and they fail to generalize to scenarios different from those seen during training. In this paper, we consider MAC problems with some intrinsic notion of locality (e.g., geographic proximity) such that interactions between agents and tasks are locally limited. By leveraging this property, we introduce a novel structured prediction approach to assign agents to tasks. At each step, the assignment is obtained by solving a centralized optimization problem (the inference procedure) whose objective function is parameterized by a learned scoring model. We propose different combinations of inference procedures and scoring models able to represent coordination patterns of increasing complexity. The resulting assignment policy can be efficiently learned on small problem instances and readily reused in problems with more agents and tasks (i.e., zero-shot generalization). We report experimental results on a toy search and rescue problem and on several target selection scenarios in StarCraft: Brood War, in which our model significantly outperforms strong rule-based baselines on instances with 5 times more agents and tasks than those seen during training.
- North America > United States > New Jersey > Hudson County > Secaucus (0.04)
- North America > United States > California > San Diego County > San Diego (0.04)
- North America > United States > California > Los Angeles County > Long Beach (0.04)
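The assignment step described in the abstract above (a centralized optimization whose objective is parameterized by a scoring model) can be illustrated with a minimal sketch. This is not the paper's method: the distance-based `score` function, its `weight` parameter, and the brute-force `assign` search are all hypothetical stand-ins chosen for illustration, and the exhaustive search is only feasible for tiny instances.

```python
from itertools import permutations
from math import dist


def score(agent, task, weight=-1.0):
    """Hypothetical scoring model: with a negative weight, closer tasks score higher."""
    return weight * dist(agent, task)


def assign(agents, tasks, weight=-1.0):
    """Brute-force inference: return the one-to-one assignment with maximal total score.

    Assumes len(agents) <= len(tasks). Entry i of the result is the index
    of the task given to agent i.
    """
    best_total, best = float("-inf"), None
    for perm in permutations(range(len(tasks)), len(agents)):
        total = sum(score(agents[i], tasks[j], weight) for i, j in enumerate(perm))
        if total > best_total:
            best_total, best = total, perm
    return list(best)


# Illustrative 2D positions (made up for this example).
agents = [(0.0, 0.0), (5.0, 5.0)]
tasks = [(4.0, 5.0), (1.0, 0.0), (9.0, 9.0)]
assignment = assign(agents, tasks)
print(assignment)  # [1, 0]: each agent takes its nearest task
```

Because the scoring function only looks at one agent-task pair at a time, the same `score` can be reused unchanged on instances with more agents and tasks, which is the intuition behind the generalization claim; a practical system would replace the exhaustive search with a scalable solver.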
Inference of Human-derived Specifications of Object Placement via Demonstration
Cuellar, Alex, Siu, Ho Chit, Shah, Julie A
As robots' manipulation capabilities improve for pick-and-place tasks (e.g., object packing, sorting, and kitting), methods for understanding human-acceptable object configurations remain limited in their ability to express the spatial relationships that matter to humans. To advance robotic understanding of human rules for object arrangement, we introduce positionally-augmented RCC (PARCC), a formal logic framework based on region connection calculus (RCC) for describing the relative position of objects in space. Additionally, we introduce an inference algorithm for learning PARCC specifications via demonstrations. Finally, we present the results from a human study, which demonstrate our framework's ability to capture a human's intended specification and the benefits of learning-from-demonstration approaches over human-provided specifications.
- Europe > Austria > Vienna (0.14)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.05)
- Europe > United Kingdom > England > Hertfordshire (0.04)
- Research Report > New Finding (0.46)
- Research Report > Experimental Study (0.46)
- Government > Military (0.68)
- Government > Regional Government > North America Government > United States Government (0.67)
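For readers unfamiliar with region connection calculus, which the PARCC abstract above builds on, the following sketch classifies two closed 1D intervals into the eight standard RCC8 relations. It covers plain RCC only; the positional augmentation that PARCC adds is not modeled here, and the function name `rcc8` and the interval representation are assumptions made for illustration.

```python
def rcc8(a, b):
    """Return the RCC8 relation between closed intervals a=(a1, a2) and b=(b1, b2).

    Assumes a1 < a2 and b1 < b2. Relation names follow the usual RCC8
    abbreviations: DC, EC, PO, EQ, TPP, NTPP, TPPi, NTPPi.
    """
    a1, a2 = a
    b1, b2 = b
    if (a1, a2) == (b1, b2):
        return "EQ"      # identical regions
    if a2 < b1 or b2 < a1:
        return "DC"      # disconnected: no shared points
    if a2 == b1 or b2 == a1:
        return "EC"      # externally connected: touch at a boundary only
    if b1 <= a1 and a2 <= b2:
        # a is a proper part of b; tangential if it shares a boundary point
        return "NTPP" if (b1 < a1 and a2 < b2) else "TPP"
    if a1 <= b1 and b2 <= a2:
        # b is a proper part of a (inverse relations)
        return "NTPPi" if (a1 < b1 and b2 < a2) else "TPPi"
    return "PO"          # partial overlap


print(rcc8((0, 2), (3, 5)))  # DC
print(rcc8((0, 3), (2, 5)))  # PO
print(rcc8((1, 2), (0, 5)))  # NTPP
```

The same case analysis extends to axis-aligned boxes by combining per-axis results, though that extension is not shown here.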
we will publish both the data and code if the paper is accepted---this was an oversight by us for not making clear we
We thank the reviewers for their thoughtful reviews and below we address their major concerns. This variability would be expected even from different recording sessions for the same subject. This allows researchers to add multiple covariates (e.g., different experimental ... Also related to Reviewer 1's comments, it is certainly possible to have different numbers ... Another major point/question raised by the reviewers was the sensitivity of our results to our initialization procedure. It is not necessary but it simplifies the inference derivation.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- North America > Canada (0.04)
- North America > United States > New York > Suffolk County > Stony Brook (0.05)
- North America > United States > Arizona > Maricopa County > Phoenix (0.05)
- North America > United States > Texas > Travis County > Austin (0.04)
- Information Technology > Sensing and Signal Processing > Image Processing (1.00)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)
- North America > United States (0.14)
- Europe > Portugal > Castelo Branco > Castelo Branco (0.04)
- Health & Medicine > Therapeutic Area > Neurology (1.00)
- Health & Medicine > Health Care Technology (0.67)
- Health & Medicine > Diagnostic Medicine (0.67)
- North America > United States > New York > New York County > New York City (0.04)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)